Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm
نویسنده
چکیده
A very efficient direct solver, known as the Thomas algorithm, is frequently used for the solution of band matrix systems that typically appear in models of mechanics. The processor idle time is a reason for the poor parallelization efficiency of the solvers based on the pipelined parallel Thomas algorithms. In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain. Key words, parallel computing, parallelization model, directionally split methods, pipelined Thomas Algorithm, banded matrices, ADI and FS mcthods Subject classification. Computer Science
منابع مشابه
Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields
This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF (2m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...
متن کاملNumerically Stable Variants of the Communication-hiding Pipelined Conjugate Gradients Algorithm for the Parallel Solution of Large Scale Symmetric Linear Systems
By reducing the number of global synchronization bottlenecks per iteration and hiding communication behind useful computational work, pipelined Krylov subspace methods achieve significantly improved parallel scalability on present-day HPC hardware. However, this typically comes at the cost of a reduced maximal attainable accuracy. This paper presents and compares several stabilized versions of ...
متن کاملOptimal fast digital error correction method of pipelined analog to digital converter with DLMS algorithm
In this paper, convergence rate of digital error correction algorithm in correction of capacitor mismatch error and finite and nonlinear gain of Op-Amp has increased significantly by the use of DLMS, an evolutionary search algorithm. To this end, a 16-bit pipelined analog to digital converter was modeled. The obtained digital model is a FIR filter with 16 adjustable weights. To adjust weights o...
متن کاملA Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver
In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...
متن کاملFuzzy Programming for Parallel Machines Scheduling: Minimizing Weighted Tardiness/Earliness and Flow Time through Genetic Algorithm
Appropriate scheduling and sequencing of tasks on machines is one of the basic and significant problems that a shop or a factory manager encounters; this is why in recent decades extensive studies have been done on scheduling issues. One type of scheduling problems is just-in-time (JIT) scheduling and in this area, motivated by JIT manufacturing, this study investigates a mathematical model for...
متن کامل